Integrating Shallow and Linguistic Techniques for Information Extraction from Text

Authors

  • Fabio Ciravegna
  • Nicola Cancedda
Abstract

Many experiments have shown that traditional approaches to both Natural Language Processing (NLP) and Information Retrieval (IR) are not effective enough to extract information from text; as a matter of fact shallow techniques (such as statistics, keyword analysis, etc.) tend to be imprecise, although efficient and transportable, whereas linguistic approaches tend to be very precise but not robust and efficient. Integrating NLP and IR is the challenge for the evolution of text processing systems for the next few years. In this paper an architecture that integrates shallow and linguistic processing is presented. Shallow techniques are used to limit the linguistic analysis to the interesting sections, and to help the parser reduce the overhead. The linguistic analyzer carefully extracts the information, controlling the combinatorics of parsing and any misdirected parsing efforts. Some preliminary results show that the architecture has considerable advantages with respect to traditional approaches to information extraction from text.
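The general idea of using a cheap shallow pass to decide where a costlier analysis should run can be illustrated with a minimal sketch. This is not the authors' system: the trigger words, the acquisition domain, the regular expression standing in for a linguistic analyzer, and all function names are assumptions made purely for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical trigger words for an illustrative "company acquisition" domain.
TRIGGERS = {"acquired", "acquires", "bought", "buys"}

# Hypothetical stand-in for a linguistic analyzer: one pattern instead of a parser.
PATTERN = re.compile(
    r"(?P<buyer>[A-Z]\w+) (?:acquired|acquires|bought|buys) (?P<target>[A-Z]\w+)"
)


@dataclass
class Acquisition:
    buyer: str
    target: str


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def shallow_filter(sentences):
    """Cheap stage: keep only sentences that contain a trigger keyword."""
    kept = []
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if words & TRIGGERS:
            kept.append(sentence)
    return kept


def deep_analyze(sentence):
    """Costlier stage (here just a regex): try to fill a template from one sentence."""
    match = PATTERN.search(sentence)
    if match is None:
        return None
    return Acquisition(match.group("buyer"), match.group("target"))


def extract(text):
    """Run the costlier stage only on sentences selected by the shallow stage."""
    candidates = shallow_filter(split_sentences(text))
    results = [deep_analyze(s) for s in candidates]
    return [r for r in results if r is not None]
```

In this sketch, calling `extract` on a three-sentence text in which only one sentence mentions an acquisition sends only that sentence to `deep_analyze`; the other two are discarded by the keyword filter before any pattern matching is attempted.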

Similar resources

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is a process of automatically providing a structured representation from an unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) which has been intensified by the increased volume of information and heterogeneity, and non-structured form of it. One of the core information extraction tasks is relation extraction wh...

Linguistic Processing of Texts Using Geppetto

We describe the linguistic analyzer of a prototype for Information Extraction from texts. Such analyzer uses information derived from a shallow processor to limit the computational cost of the analysis. At the same time, shallow techniques are used to collapse parse fragments when a complete parse is not possible. The linguistic analyzer has been built using GePpeTto, an environment that allows...

Integrating Balanced Scorecard with Fuzzy Linguistic and Fuzzy Delphi Method for Evaluating Performance of Team Sports (SANAT NAFT NOVIN Abadan Football Club)

Extraction of Drug-Drug Interaction from Literature through Detecting Linguistic-based Negation and Clause Dependency

Extracting biomedical relations such as drug-drug interaction (DDI) from text is an important task in biomedical NLP. Due to the large number of complex sentences in biomedical literature, researchers have employed some sentence simplification techniques to improve the performance of the relation extraction methods. However, due to difficulty of the task, there is no noteworthy improvement in t...

Publication date: 1995